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FAST PHASE RETRIEVAL FROM LOCAL CORRELATION MEASUREMENTS 


MARK IWEN, ADITYA VISWANATHAN, AND YANG WANG 


Abstract. We develop a fast phase retrieval method which can utilize a large class of local phase¬ 
less correlation-based measurements in order to recover a given signal x £ C“ (up to an unknown 
global phase) in near-linear O (dlog 4 d)-time. Accompanying theoretical analysis proves that the 
proposed algorithm is guaranteed to deterministically recover all signals x satisfying a natural 
flatness (i.e., non-sparsity) condition for a particular choice of deterministic correlation-based mea¬ 
surements. A randomized version of these same measurements is then shown to provide nonuniform 
probabilistic recovery guarantees for arbitrary signals x £ C . Numerical experiments demonstrate 
the method’s speed, accuracy, and robustness in practice all code is made publicly available. 

In its simplest form, our proposed phase retrieval method employs a modified lifting scheme akin 
to the one utilized by the well-known PhaseLift algorithm. In particular, it interprets quadratic 
magnitude measurements of x as linear measurements of a restricted set of lifted variables, XiX], 
for \j — i\ < 8 d. This leads to a linear system involving a total of (28 — l)d unknown lifted 
variables, all of which can then be solved for using only O(Sd) measurements. Once these lifted 
variables, XiX] for \j — i\ < 8 <C d, have been recovered, a fast angular synchronization method 
can then be used to propagate the local phase difference information they provide across the entire 
vector in order to estimate the (relative) phases of every entry of x. In addition, the lifted variables 
corresponding to XjXJ = \Xj \ 2 automatically provide magnitude estimates for each entry, Xj, of x. 
The proposed phase retrieval method then approximates x by carefully combining these entry-wise 
phase and magnitude estimates. 

Finally, we conclude by developing an extension of the proposed method to the sparse phase 
retrieval problem; specifically, we demonstrate a sublinear-time compressive phase retrieval algo¬ 
rithm which is guaranteed to recover a given s-sparse vector x £ (D d with high probability in just 
0(s log 5 s ■ log d)-time using only 0(s log 4 s • log d) magnitude measurements. In doing so we demon¬ 
strate the existence of compressive phase retrieval algorithms with near-optimal linear-in-sparsity 
runtime complexities. 


1. Introduction 

We consider the phase retrieval problem of recovering a vector x 6 <C d , up to an unknown global 
phase factor, from squared magnitude measurements 

(1) b := |Mx| 2 + n, 

where b E IR/ 0 is the vector of phaseless measurements (with D > d), M E <C Dxd is a measurement 
matrix, n E R^ denotes measurement noise, and, | • | 2 : <C D — > R D computes the componentwise 
squared magnitude of each vector entry. Our objective is to design a computationally efficient and 
robust recovery method, Am '■ R' 0 —> <D d , which can approximately recover x using the magnitude 
measurements b that result from any member of a relatively large class of local correlation-based 
measurement matrices M E (& Dxd . 

Phase retrieval problems of this form arise naturally in many crystallography and optics appli¬ 
cations (see, e.g., [521 ESI E3 [37]). As an illustrative example, Figure [I] presents a typical imaging 
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setup for a popular molecular imaging modality known as ptychography [25j. The figure shows a 
beam of electromagnetic radiation illuminating a small region of a test specimen. Under certain 
conditions (for example, if the wavelength of the incident radiation is of the same order as the atomic 
features in the specimen), the resulting diffraction pattern contains information about the atomic 
structure of the region being imaged. Therefore, by successively imaging shifts of the specimen, 
one might hope to deduce the complete atomic structure from such measurements. Unfortunately, 
it can be shown that the diffraction pattern measured by the detector corresponds to the (squared) 
magnitude of the Fourier transform of the specimen being imaged, meaning, among other things, 
that all phase information is lost. This is characteristic of many molecular imaging methods, where, 
either due to the underlying physics or due to instrumentation challenges, the detectors are not 
capable of capturing phase information. Therefore, a phase retrieval problem needs to be solved 
in order to recover the underlying atomic structure. For ptychography applications, in particular, 
and restricting our attention to one dimension, the acquired measurements are of the form 

bi(u) = \F [in Sex] (w)| 2 , 

where x is the specimen of interest, m is a localized illumination or window function (which depends 
on the optical setup), Se denotes a shift/translation operator, and J 7 denotes the Fourier transform. 
Note that the Fourier intensity measurement be(uj) corresponds to a specific translate, or shift, £ of 
the unknown specimen x. Furthermore, we may assume that supp(m) C supp(x) since the imaging 
field of view is typically much smaller than the support of the specimen. 



Diffraction Pattern 


Ptychographic Image 


Figure 1. A Typical Imaging Setup for Ptychography. Q 


Discretizing the problem on a uniform grid, we obtain 
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4mage credits: Qian, Jianliang, et al. “Efficient algorithms for ptychographic phase retrieval.” Inverse Problems 
and Applications, Contemp. Math 615 (2014): 261-280. 
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where x, m G <C d and (b u )g G IR is the analogous discrete measurement corresponding to a (circular) 
^-shift of the unknown vector x. Due to the finite field of view of the imaging system, we may also 
assume that fhk = 0 ,k > 5, where 5 G and 6 < d. Hence, we may writ^] 

2 r i 2 


(2) 


(b u ) t = 


<5-1 

E 

fc =0 


ink I 


<5-1 

E ; 

k =0 




Z7rn luk 

(m u)/- := e d mk . 


The imaging process thus involves collecting measurements {{b^g} for various Fourier frequencies 
and overlapping image space shifts C Z 2 3 n [0 ,d — l] 2 . Here, in ([2]), and throughout the 

remainder of the paper, it is important to keep in mind that the parameter 6 always corresponds 
to the width of each window/mask m w . The fact that this support size, <5, is significantly smaller 
that d is what leads to the locality of our proposed measurements. 

As we will see in §[2j our proposed measurement constructions are identical to ([2]) for a specific 
collection of Fourier frequencies and spatial shifts. We also note that ([2]) can be interpreted as 
squared magnitude correlation measurements of the unknown vector x with the local (since i5<(i) 
masks m^. In particular, 

(3) b w = |corr(m w ,x)| 2 , 

where G IR^ denotes diffraction measurements corresponding to frequency u and all possible 
image space shifts of the unknown vector x. 


1.1. Survey of Previous Work. Given the importance of phase retrieval in crystallographic and 
optical imaging methods, there is a rich history of research on this topic by scientists and practi¬ 
tioners across diverse fields. With regard to computational algorithms specifically, most popular 
methods in use today can trace their origins back to the alternating projection algorithms devel¬ 
oped in the 1970s by Gerchberg and Saxton [23], and Fienup 120]. These are iterative formulations 
which work by alternately requiring that the following two sets of constraints be satisfied: 

(i) (in data space) the current iterate has the same magnitude as that of the measured data, and 

(ii) (in image space) the current iterate satisfies certain problem-specific constraints such as pos¬ 
itivity or compact support. 

These algorithms are conceptually simple, efficient to implement (FFT-time for certain measure¬ 
ment constructions) and popular among the practitioners, despite the lack of a rigorous mathemat¬ 
ical understanding of their properties or global recovery guarantees. Indeed, given the non-convex 
nature of the phase retrieval problem, the lack of global convergence results for these methods is 
not surprising. The interested reader is referred to mm for some recent developments and a 
review of this family of methods, while [31. T6) [35] contain some specific applications of alternating 
projection algorithms to Ptychographic imaging. 

More recently, there have been significant efforts devoted towards developing phase retrieval 
algorithms which are (i) computationally efficient, (ii) robust to measurement noise, and (iii) theo¬ 
retically guaranteed to reconstruct a given vector up to a global phase error using a near-minimal 
number of magnitude measurements. For example, it has been shown that robust phase retrieval 
is possible with D = 0{d) magnitude measurements by solving a semidefinite programming relax¬ 
ation ( PhaseLift ) of a rank-1 matrix recovery problem jfl2. .8], This allows polynomial-time convex 
optimization methods to be used for phase retrieval. Furthermore, the runtimes of these convexity- 
based methods can be reduced with the use of O(dlogd) magnitude measurements [IT] , Other 
phase retrieval approaches include the use of spectral recovery methods together with magnitude 
measurement ensembles inspired by expander graphs [2]. These methods allow the recovery of x 

2 All indexing is considered mod d; x denotes the complex conjugate of x. 
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up to a global phase factor using 0(d) noiseless magnitude measurements or 0(d\ogd) noisy mag¬ 
nitude measurements, and run in fl(d 2 )-time in general]^] Finally, the recently proposed Wirtinger 
Flow algorithm [TO] and Truncated Wirtinger Flow (TWF) algorithm [15] employ stochastic gradi¬ 
ent descent schemes with special eigenvector-based initialization methods to recover x. The TWF 
algorithm, for example, recovers x using O(d) magnitude measurements robustly with computa¬ 
tional complexity |^](!}(.D<i log 1/e), where e is an accuracy parameter. All of these approaches utilize 
magnitude measurements |Mx| 2 resulting from either (i) Gaussian random matrices M, (ii) ran¬ 
dom masked Fourier-based constructions known as coded diffraction patterns (CDP) [26], or (iii) 
unbalanced expander graph constructions, in order to prove their recovery guarantees. 

1.2. Main Result. In this paper we demonstrate that a relatively general class of local correlation- 
based magnitude measurements allow for phase retrieval in just (D{d\og c d)-time{^] In particular, we 
construct a well-conditioned set of Fourier-based measurements which are theoretically guaranteed 
to allow for the phase retrieval of a given vector x E C d with high probability in 0(dlog 4: d)- 
tirne. These measurements are of particular interest given that they are closely related to, e.g., 
ptychography [45] . In particular, we prove the following theorem: 

Theorem. /Fast Phase Retrieval from Correlation-Based Measurements ) Let x e <D d with 
d sufficiently large have 11x11| > C (d lnd) 2 ln 3 (ln(i) ||n|| 2 j^] Then, one can select a random measure¬ 
ment matrix M € (C Dxd such that the following holds with probability at least 1 — f^| 

The proposed Phase Retrieval Algorithm [l] (BlockPR) will recover an x e with 
(4) min llx—® ie x|| < C"(d lnd) 2 ln 3 (lnd)||n ||2 

0e[O,27r) II II 2 

when given arbitrarily noisy input measurements b = |Afx| 2 + n E as per (ffl). Ffere D 
can be chosen to be 0(d ■ ln 2 (d) • In 3 (IncZ)). Furthermore, the proposed Algorithm ITT will run in 
0(d • ln 3 (d) • In 3 (In d))-time. 

To the best of our knowledge, this result provides the best existing error guarantee for correlation- 
based measurements. Moreover, this result also guarantees exact recovery, up to a global phase 
multiple, of x in the noiseless setting (i.e., when n = 0). Further, for a particular class of flal|^]vectors 
x, M can be chosen to be a deterministic matrix arising from local correlation-based measurements, 
and D = 3d measurements suffice for recovery in the noiseless setting. 

Numerical experiments both verify the speed and accuracy of the proposed phase retrieval ap¬ 
proach, as well as indicate that the approach is highly robust to measurement noise. Additionally, 
after establishing and analyzing our general phase retrieval method, we then utilize it in order to 
establish the first known linear-in-sparsity compressive phase retrieval method capable of recovering 
s-sparse vectors x (up to an unknown phase factor) in only 0(s log c d)-time. 


o 

Their runtime complexity is dominated by the time required to solve an overdetermined linear system. 

4 The Wirtinger flow algorithms are known to empirically work with random masked measurement constructions 
known as coded diffraction patterns (CDP) [9| [15] in essentially FFT-time, although no robust recovery guarantees 
are available for such measurements. 

^Herein c is a fixed absolute constant. 

^Herein C, C', C" € 1R + are all fixed and absolute constants. 

^Note that M is selected independently of both x and n. Furthermore, the nonuniform probability of success can 
be boosted to 1 — p for any p £ (0,1) at th e expense of introducing additional logarithmic factors of 1/p into the 
runtime and measurement bounds. See f4.3 for additional details. 


Here, “flat” sim ply m eans that there are no long strings of consecutive entries of x all of whose magnitudes are 


less than 


2Wd ' 


See S 4.2 


for additional details. 
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1.2.1. An Overview of the Proposed Approach for Local Correlation-Based Measurements. The pro¬ 
posed phase retrieval approach works in two stages in the ptychographic setting: During the first 
lifting (see, e.g., [12j 8]) stage, the quadratic magnitude measurements ([3]) are viewed as linear 
measurements of (2 5 — 1 )d new lifted variables, the entry-wise products x/xf for \j — i\ < 6 -C dj^] 
Given the correlation structure of the original measurements <©. the resulting lifted linear system 
in the new lifted unknowns x/xf is also highly structured. In particular, the resulting coefficient 
matrix of the lifted linear system turns out to be block circulant m, which both allows explicit 
condition number bounds to be derived for particular choices of windows m w , and also allows its 
fast numerical inversion using Fourier transforms. Thus, we are generally always able to (rapidly) 
invert the lifted linear system in order to solve for all local entry-wise products x/xf with \ j — i\ < 5. 
As an immediate consequence, the magnitude of each entry of x can also be estimated using, e.g., 
the products XjXf = \xj\ 2 . 

During the second angular synchronization (see, e.g., [39]) stage of the algorithm, the local 
entry-wise products x/xf, i / j, are used in order to estimate the relative phases of each individual 
entry of x. This is done by noting that the phase of each lifted variable x/3Lf, arg {x/xf), provides 
an estimate of the phase angle difference between the corresponding entries’ phases arg(xj) and 
arg {xj). Hence, the relative phases between all pairs of entries of x can be approximated by adding 
these local relative phase differences, provided by arg ( XiXf ) for | j — i\ < 5, together in appropriate 
(telescoping) sums. The only potential difficulty with this simple approach occurs when either Xi 
or Xj happens to be zero (or, more generally, very small in magnitude). In this case the phase 
difference given by arg {xiXf) is unreliable/unusable. In the worst possible case, when, e.g., x 
contains > 6 consecutive zero entries in a row, the available local phase differences can not span the 
corresponding gap in reliable phase difference information, and signal recovery (up to a single global 
phase ambiguity) becomes impossible. Much of the analysis of this stage focusses on categorizing 
and circumventing this fundamental difficulty. 

Combining both stages above allows one to obtain deterministic algorithms for recovering all 
x that do not contain any long contiguous strings of > 6 small entries with local correlation 
measurements. This result, which applies in the ptychographic setting, is ultimately developed in 
and stated in Theorem [5} Next, in §4.3[ it is shown that the worst case set of signals can still 
be recovered with high probability if one is allowed to precede one’s correlation measurements with 
an initial randomized global masking operation. Combining this result with Theorem [5] yields the 
main theorem stated above. Finally, once phase retrieval of sparse signals has been allowed by the 
employment of randomized masking in §4.3[ it becomes possible to develop sparse phase retrieval 
algorithms with near-optimal runtimes in lj6] 

The remainder of this paper is organized as follows: In Section [2] we establish notation and 
discuss important preliminary results. Next, in Section [3j we present our general phase retrieval 
algorithm and discuss it’s runtime complexity. We then analyze our phase retrieval algorithm and 
prove recovery guarantees for specific types of Fourier-based measurement matrices in Section [4] In 
Section [5j we empirically evaluate the proposed phase retrieval method for speed and robustness. 
Finally, in Section [6j we use our general phase retrieval algorithm in order to construct a sublinear¬ 
time compressive phase retrieval method which is guaranteed to recover sparse vectors (up to an 
unknown phase factor) in near-optimal time. Section [7] concludes with several suggestions for future 
work. 


14.2 


®Here, i — j £ {—d/2 ,..., —1, 0,1,..., d/2} is always considered modulo d. Note that these particular (2 5 — 1 )d 
entry-wise products are exactly all those that appear when the squared magnitude measurements (J2| are multiplied 
out for all shifts l (treating x as periodic). 
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2. Preliminaries: Notation and Setup 


For any matrix X G d C Dxd we will denote the j th column of X by X ? G (D 42 . The conjugate 
transpose of a matrix X G will be denoted by X* G <D dx£> , and the singular values of any 

matrix X G (& Dxd will always be ordered as cr\(X) > cr 2 (X) > ■ ■ ■ > fT m i n (/o.rf) (X) > 0. Also, the 
condition number of the matrix X will denoted by k(X) := (?i(X) / o^^D^iX). We will use the 
notation [n] := {1, ..., n} C N for any n G N. Finally, given any x G <D d , the vector x s opt G <D rf will 
always denote an optimal s-sparse approximation to x. That is, it preserves the s largest entries in 
magnitudes of x while setting the rest of the entires to 0. Note that x s opt G <C d may not be unique 
as there can be ties for the s th largest entry in magnitude. 


2.1. An Illustrative Example. Before we write down the setup for the general case, we present a 
simple but illustrative example which highlights the structure of our proposed measurements, and 
provides a general overview of the reconstruction algorithm. For simplicity, let us assume that we 
are given noiseless measurements in this section so that 

b = |Mx| 2 , 

with x G <D 4 , b G R 12 , and M G C 12x4 . Further, let us assume that the measurement matrix M 
has a block-circulant structure of the form 


( ( m i)i 
0 
0 


M = 


(mi) 

(m 2 ) 

0 

0 

(m 2 ) 


( m i ) 2 

( m i)i 

0 

0 

(m 2 ) 2 

( m 2 )i 

0 

0 


(m3)! (m 3 ) 2 

0 ( m s)i 

0 0 
0 


0 

0 

( m l) 2 

0 

( m l)l 

(mi). 

0 

(mi) 

0 

0 

( m 2) 2 

0 

( m 2)l 

(m 2 ): 

0 

(m 2 ) 

0 

0 

( m 3) 2 

0 

( m s)i 

( m s): 

0 

( m s) 



\ ( m 3) 2 

where Mi, M 2 , M% G C 4x4 are circulant matrices and mi,m 2 ,m 3 G C 4 are masks or window 
functions with finite support. In particulaip^ (m f)i = 0 for i = 3,4 and t G {1,2,3}. The astute 
reader will note that this construction describes correlation measurements of the unknown vector 
x with local masks m^; i.e., b = |corr(m^>, x)| 2 , t G {1,2,3}. Writing out the correlation sum 
explicitly and setting 6 = 2, we obtain 

<5 


(5) 


(Pt)i = 


' X i+k-l 


k= 1 


M)G{ 1, 2,3,4} x {1,2,3}. 


For a suitable choice of mask such as 


{ <B~ k / a 

^ 25-1 

0 


• <E 


25-1 


if k < 5 
if k > 5 


with a := max < 4, 


5-1 


we see that ([5j is of the form of ([2j in ) ]lj w hich described ptychographic measurements. These are 
exactly the measurements analyzed in §4.1| 


4( ^Tlie notation (me)i denotes the i -th entry of the £-th mask. 
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Having summarized our measurement construction, we now turn our attention to describing the 
reconstruction algorithm, which can be conceptually divided into the following two steps: 

(i) Estimate (scaled) local phase differences of the form {TjXk \ j- k E [d], \j — k mod d\ < 5}. 

(ii) Recover the phases of each entry of x, and consequently x itself, by using the estimates of the 
local phase differences from (i). 

To obtain the phase differences in step (i) above, we start by rewriting the squared correlation 
measurements <© as follows: 


(Mi 


5 2 8 

n • zi+k-i = (m^)j %i+j— %i+k —1 

k =1 j,k =1 


8 

^ -Ei+j— 1 I? 

j,k=l 


where we have used the notation (m g)j k := (m^)j(m^) fc . This is a linear system of equations for 
D = (25 — 1 )d = 12 phase differences, y G C 12 , with 

y=[|xi| 2 X\X 2 X 2 Xl \x 2 \ 2 X 2 X 3 X 3 X 2 \x 3 \ 2 X 3 X 4 X 4 X 3 |x 4 | 2 X 4 X 1 XlX 4 ] T . 


By defining b G R 12 to be the interleaved vector of measurements (obtained by permuting the 
entries of b) 

b= [( 61)1 ( 62)1 (b 3 )i (bi) 2 (b 2 ) 2 ( 63)2 ( 61)3 (£> 2)3 (^ 3)3 (^ 1)4 (^ 2)4 (b 3 )4] T , 

we may write the resulting linear system as M'y = b, where 


\ra4u 

( m i j 1,2 

(nil 2 2,1 

(mi 

2,2 

u 

U 

u 

u 

u 

U 

U 

U \ 

(m 2 )i,i 

(m 2 )i, 2 

(1112)2,1 

(m 2y 

2,2 

0 

0 

0 

0 

0 

0 

0 

0 

(m 3 )i,i 

(1113)1.2 

(1113)2,1 

(ms, 

2,2 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

(mi, 

1,1 

( m i)i, 2 

(1111)2,1 

(1111)2,2 

0 

0 

0 

0 

0 

0 

0 

0 

(m 2 , 

1,1 

(m 2 )i,2 

(1112)2,1 

(1112)2,2 

0 

0 

0 

0 

0 

0 

0 

0 

(m 3 , 

1,1 

(m 3 )i, 2 

(1113)2,1 

(m 3 ) 2 ,2 

0 

0 

0 

0 

0 

0 

0 

0 

0 


0 

0 

( m i)i,i 

(lUl) 1,2 

(1111)2,1 

(1111)2,2 

0 

0 

0 

0 

0 

0 


0 

0 

(m 2 ) 1,1 

(1112)1,2 

(m 2 ) 2 ,i 

(1112)2,2 

0 

0 

0 

0 

0 

0 


0 

0 

(m 3 )i,i 

(1113)1,2 

(m 3 ) 2 ,i 

(1113)2,2 

0 

0 

(1111)2,2 

0 

0 

0 


0 

0 

0 

0 

0 

( m l)l,l 

( m i)i, 2 

(1111)2,1 

(m 2 ) 2 ,2 

0 

0 

0 


0 

0 

0 

0 

0 

(m 2 )i,i 

(m 2 )i, 2 

(m 2 ) 2 ,i 

41113)2,2 

0 

0 

0 


0 

0 

0 

0 

0 

(m 3 )i,i 

(1113)1,2 

(m 3)2,1/ 


We draw attention to the block-circulant structure of M', composed of two blocks M( , M! 2 G C 3x3 
highlighted using dashed lines. Much of the speed and elegance of the proposed algorithm arises 
from this block-circulant structure: not only can such systems be inverted in essentially linear-time, 
they also lend themselves to easy analysis. As we will see in Section 4.1, we can explicitly write 


out condition number bounds for M' resulting from certain classes of measurement masks. 

Note that by solving this linear system, we automatically recover the magnitude of x. In partic¬ 
ular, 


Fi I = Vi, 


F 2 I =2/4, 


F3 — 2/7, F4| “ 2/10- 


Additionally, after normalizing (to unit magnitude) the entries of y, we also recover phase difference 
estimates, 

<t>i,j ■= arg (xj) - arg(xi), i.j G {1,2, 3,4}, | i-j mod 4| = 1. 

For example, (f>\^ = arg(x 2 ) — arg(xi) = arg( 7 / 2 /1 2 / 2 1)- We can now recover arg(x) as required by 
step (ii) of our two step recovery algorithm by using a greedy procedure. 
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Assume, without loss of generality, that |xi| > \xi\, i E {2,3,4}. We start by setting arg(.xi) = 
Op] We may now set the phase of X2 and X4 using the estimated phase differences (ft 1.2 and cfti,4 
respectively; i.e., 

arg(x 2 ) = arg(xi) + ^ 1 , 2 , arg(x 4 ) = arg(xi) + </>i, 4 . 

Similarly, we next set arg(.X 3 ) = arg(x 2 ) + <^ 2 , 3 , thereby recovering all of the entries’ unknown 
phases. We note that the computational cost of this procedure is essentially linear in the problem 
size d. Section [3] contains a more detailed description of the algorithm, while Section [4] includes 
theoretical recovery guarantees. 


2.2. General Problem Setup. Hereafter we will assume that our measurement matrix M E 
<C Dxd has D := (26 — 1 )d rows corresponding to local correlation-based measurements, where 
6 E N represents the support size of the associated correlation masks. Furthermore, we will utilize 
the decomposition of M into its (25 — 1) circulant blocks, Mi,..., M 25-1 E <D dxd , given by 


( 6 ) 


M = 


( Mi 

M 2 


\ 


\ M 2 s~i ) 


Here each Mi E (& dxd will be both circulant, with 


CO (-Mi)ij . (m;)(j_j) mo d d -)- 1 

for a mask np E <D d , and banded, so that (m;)j = 0 for all i > 5, and 1 < l < 25 — l p| 

As a consequence of this structure, the squared magnitude measurements from the i th -block, 
|M;x| 2 E R rf , can be rewritten as 

s 

(8) (|M z x| 2 ). = (M/x) i (M|x) i = Xj+i-iXh+i-i- 

j,k =1 

Let y E C D be defined by 

( 9 ) Vi := X [^frP[^frl+((i+5- 2 ) mod (25—1))—<5+1- 

Furthermore, let 0 Q E R lxa be the row vector of a zeros for any given a E N, and let rrm J) G ClX5 
be such that 


( 10 ) (™(ij))fc := ( m «)i( m 0 fe- 

We can now re-express |M;x| 2 E R d from ([8]) as M;y, where Mi E (& dxD is a (25 — l)-circulant 
matrix defined by 


/ 

“(M) 

0 , 5-2 

“(/, 2 ) 

0 < 5—2 

*(i, 3 ) 


m ( l,< 5 ) 0 

0 

0 

\ 


0 2 5 —1 

m (u) 

0 , 5-2 

m(i, 2 ) 

0 , 5-2 

*(i, 3 ) 


0 

0 


V 

(“( 1 , 2 ) ) 2 



0+2 

*(i, 3 ) 

0 , 5-2 

0 

m(M) 

0 ( 5—2 (m (Zi2 ) 

)J 


^ Recall that we can only recover x up to an unknown global phase factor which, in this case, will be the true 
phase of X\. 

12 Every integer modulo d is considered to be an element of {0, 1} i n q2.2| Furthermore, all indexes of 

vectors in C d will be considered modulo d, + 1, as per |t]| for the remainder of j]‘2.2[ 












Finally, after reordering the entries of (Mx^ via a permutation matrix P G {0,l}' DxZ5 , we arrive 


at our final form 


(ID 


P |Mx| 2 = My = 


/ 


m' 2 


M's 

0 

0 . 

. 0 


0 

M[ 

M' 


M's 

0 . 

. 0 

V 

m' 2 


M's 

0 


0 . 

. M[ 


y- 


Here M' G <C DxD is a block circulant matrix [50] whose blocks, M [,..., M' s G C^ 2<5 1 ) x ( 25 1 ) ) have 
entries 


( 12 ) 


(M/)*j : = 




0 


ifl<j<5 — Z + l 

if S - l + 2 < j < 26 - l - 1 


(mj)z + i(mj) /+J -_ 2(5+1 if 28 - l < j < 2d - 1, and l < 5 
0 if j > 1, and l = 5 


Let I a denote the a x a identity matrix. We now note that M' can be block diagonalized via 
the unitary block Fourier matrices U Q G 0 “ dxad,^ parameter a G N, defined by 


( la I, 


(13) 




... In 


27ri 

27ri-(d—1) 

I a e d 

. . I a <£ d 

27ri- (d— 2) 

27ri-(d—2)-(d —1) 

I a e d 

. . I a <£ d 

27ri-(d—1) 

27ri-(d—l)-(d—1) 

I a e d 

. . I a <E d 


\ 


More precisely, one can see that we have 


/ Ji 0 0 

0 J 2 0 


(14) 


U26-1 M ' U 2S—i = J: = 


0 \ 
0 


0 0 0 Jd —i o 

\ 0 0 0 0 Jd J 


where J G <C DxD is block diagonal with blocks J\, ■ ■ ■ , Jd G C^ 2<5 1 ) x ( 25 L given by 

<5 

% , 27ri- (fc — 1)-(Z — 1) 

(15) J k := ]T M/ • £-*-. 

i=i 

Not so surprisingly, the fact that any block circulant matrix can be block diagonalized by block 
Fourier matrices will lead to more efficient computational techniques below. 


2.3. Johnson-Lindenstrauss Embeddings and Restricted Isometries. Below we will utilize 
results concerning Johnson-Lindenstrauss embeddings 32) [22, 1, 16, [3, 33j of a given finite set 
S C into <D m for m < d. These are defined as follows: 

Definition 1. Let e G (0,1), and S C C d be finite. An m x d matrix A is a linear Johnson- 
Lindenstrauss embedding of S into <D m if 

(1 — e) || u — v ||1 < || Au- A\ ||| < (1 + e)|| u- v ||| 

holds Vu, vg5U {0}. In this case we will say that A is a JL(m,d,e)-embedding of S into <D m . 

Linear JL(m,d,e)-embeddings are closely related to the Restricted Isometry Property [13, 3l !2Tj]. 
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Definition 2. Let s E [d] and e E (0,1). The matrix A E (C mxd has the Restricted Isometry 
Property if 

(16) (1 — e )ll x ||| < || Ax ||| < (1 + e)|| x ||| 

holds Vx E containing at most s nonzero coordinates. In this case we will say that A is RIP(s,e). 

In particular, the following theorem due to Krahmer and Ward [Ml EH demonstrates that a matrix 
with the restricted isometry property can be used to construct a Johnson-Lindenstrauss embedding 
matrix. 


Theorem 1. Let S C <C d be a finite point set with |<S| = M. For e,p E (0,1), let A E <P mxd be 
RIP(2s,e/C\) for some s > C 2 • ln(4M/p)p* *| Finally, let B E {—1,0, \} dxd be a random diagonal 
matrix with independent and identically distributed (i.i.d.) symmetric Bernoulli entries on its 
diagonal. Then, AB is a JL(m,d,e)-embedding of S into <D m with probability at least 1 — p. 


Below we will utilize Theorem[T]together with a result concerning the restricted isometry property 
for sub-matrices of a Fourier matrix. Let F E <C dxd be the unitary d x d discrete Fourier transform 
matrix. The random sampling matrix , R' E <D mxd , for F is then 


(17) 



where R E {0,1 } mxd is a random matrix with exactly one nonzero entry per row (i.e., each entry’s 
column position is drawn independently from [d] uniformly at random with replacement). The 
following theorem is proven in 

Theorem 2. Let p E (0,1). If the number of rows in the random sampling matrix R' E <D mxd 
satisfies both 


(18) 


m s In 2 (8s) ln(8d) 

ln(9m) ~ 3 e 2 


and 

(19) 


m > C 4 


slog(l/p) 


then R' will be RIP(2s,e/C\) with probability at least 1 — pp*] 


We are now prepared to present and analyze our phase retrieval method. 


3. BlockPR: A Fast Phase Retrieval Algorithm 

The proposed phase retrieval algorithm ( BlockPR ) works in two stages. In the first stage, the 
vector y E from Q of local entrywise products of x E G d with its conjugate is approximated 
by solving the linear system 0- That is, we compute (M r ) 1 Pb where b E are our noisy 
local correlation-based measurements from (jT]) , and M' and P are as in ©• This yields 

(20) y := (M / )” 1 Pb = (M / ) _1 P |Mx| 2 + (M / ) _1 Pn = y + (M') _1 Pn. 

where y is as in Q. 

Next, a greedy algorithm is used to recover the magnitudes and phases of each entry of x (up to 
a global phase factor) from our estimate of y. To see how this works, note that y will contain all 
of the products XiXj for all i, j E [d] with |(i — j) mod d\ < As a result, the magnitude of each 

13 Here C-\, C 2 € (l,oo) are both fixed absolute constants. 

34 See Theorem 12.32 in Chapter 12. 

* ,r ’Here (73,(74 € (l,oo) are both fixed absolute constants. 

^®In ^ 3 ] and below we always consider any integer modulo d to be in the set {— Tf ] + 1, • ■ •, LiJIi f° r ^ < L^J • 
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entry Xj can be estimated using the entry of y corresponding to XjXj = \xj\ 2 . Similarly, as long 
as both XjXj > 0 and XiXi > 0 hold, one can also compute the phase difference arg(xj) — arg (xj) 
from arg (xiXj). Thus, the phase of each x* can be determined once arg (xj) is established for a 
neighboring entry. Repeating this process allows one to determine a network of phase differences 
which all depend uniquely on the choice of a single entry’s unknown phase. This entry’s phase 
becomes the global phase factor ® 10 from Q. See Algorithm [I] for additional details. 


Algorithm 1 BlockPR 


Input: Local Correlation Measurements b £ R~° (Recall ([I])) 

Output: x £ <C d with x « e 10 x for some 6 £ [ 0 , 2 tt\ 

1 : Compute y := (M')~ 1 Pb (see (20)) 

2 : Use Algorithm tel with input y £ <C D to compute 4>j ~ (j)j := arg (xj) for all j £ [d] 

3: Set Xj = \f\yj'\<P L<t>j Vj £ [d], where yy (computed in line 1) is s.t. yy = XjXj + ((M')~ 1 Pn) 


Recall from (20) that we have access to 


( 21 ) 


y = y + n £ c 


D 


where n := (. M') 1 Pn results from the measurement noise, and D = (25 — 1 )d. Thus, from Q we 
have an index k(i,j ) £ [D] for all i, j £ [d] with \(i — j ) mod d\ < 5 such that 

( 22 ) Vk(i,j) XiXj + n k ( i}j y 

We will utilize the index function k for y defined by (22) in Algorithm [2j 


It is important to note that Algorithm [l] assumes that the block circulant matrix M' arising 
from our choice of measurements, M, is invertible. As we shall see in f[4] and ^5j this is relatively 
easy to achieve. Similarly, Algorithm [2] implicitly assumes that y does not contain any strings of 
5—1 consecutive zeros (or, more generally, <5—1 consecutive entires with “very small” magnitudes). 
This assumption will also be discussed in lj4] and fj5j and justified for measurements arising from 
arbitrary x by modifying the measurements M. For the time being, then, we are left free to consider 
the computational complexity of Algorithm [l] 

3.1. Runtime Analysis. We will begin our analysis of the runtime complexity of Algorithm [l] by 
considering the computation of y £ <C D in line 1. Recalling ^2j we note that the permutation matrix 
P is based on a simple row reordering that clusters the first rows of Mi,..., AI 2 S -1 into a contiguous 
block, the second rows of Mi,..., M 2 S -1 into a second contiguous block, etc. (see ([b]) and Q). 
Thus, P b is simple to compute using only 0(d ■ d)-operations. To finish calculating y = (M') _r Pb 
we then use the decomposition of M' from (14) and compute y = U 25-1 J _1 t/| (5 _ 1 Pb. 


Recalling the definition of U 28-1 (13), one can see that both C/ 25-1 and U^-i have fast matrix- 
vector multiplies (i.e., because they can be computed by performing 25—1 independent fast Fourier 
transforms on different sub-vectors of size d ). Hence, matrix-vector multiplies with both of these 
matrices can be accomplished with 0(5 ■ dlogd) operations. Finally, J is block-diagonal with d 
blocks of size (25 — 1) x (25 — 1) (see ([15])). Thus, J and J 1 can both be computed using 0(d ■ 5 3 ) 
total operations. Putting everything together, we can now see that line 1 of Algorithm [l] requires 
only 0(d ■ 5 3 + 5 ■ d log d) operations in general. Furthermore, these computations can easily benefit 
from parallelism due to the fact that the calculations above are all based on explicitly defined block 
decompositions. 

The second line of Algorithm [I] calls Algorithm [2] whose runtime complexity is dominated by 
its main while-loop (lines 7 through 19). This loop will visit each entry of the input vector y at 
most a constant number of times. Hence, it requires 0(5 ■ d) operations. Finally, the third line 
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Algorithm 2 Greedy Angular Synchronization 

Input: y k {ij) = XiXj Vi, j E [d] with |(i - j) mod d| < 6. 

Output: Phase angles: (pj ~ cfj := arg(xj) for all j E [d] (up to a global phase). 

1: % Estimate largest magnitude entry and set its phase to zero. 

2: a := argmaxj \y k (j,j)\ = argmax^ (| \xf 2 + |) , j = 0,..., d - 1. 

3: 4> a <— 0 % Note: We approximate the unknown phases up to a global phase factor. 

4: %Define a binary vector, phaseFlag E {0, l} d , to keep track of entries whose phase has already 
%been set. This ensures that each phase is estimated once, and then not changed again. 

phaseFlagj = 


JO, i = a, 

J 1, else. 


5: Initialize j E- a; 

6: % Estimate phases of all entries of x 
7: while phaseFlagj > 0 do 

8: % Estimate phases of 26 — 1 entries nearest current Xj 

9: for i = 1 — 6 , 2 — 6 ,..., 0,. .., 6 — 1 do 


10: % Do not over-write previously estimated phases 

11: if phaseFlag i+i mod d == 1 then 


12: %Use the current reference phase, cfj, and the input phase difference estimates, 

% arg (y k (j+i mod d, j)) ~ arg (x j+i mod dXj ) and arg j-\-i mod d)) ~ 

% arg ( XjXj + i mod d ), to estimate the phase of the entry Xj + i mod d . 

13: &j+i mod d t 4>j 2 (arg ( y k (j-\-i mod d, j;)) arg j+i mo d d)))> 


14: % Remember that the phase of entry Xj+i mod d has now been estimated 

15: phaseFlag i+i mod d E- 0; 


16: end if 

17: end for 

18: % Update the reference entry to be the largest neighboring entry of the current Xj 


19: end while 


j + arg max 

o <i<5 


I Vk(j+i 


mod d , j+i mod d) 


mod d\ 


of Algorithm [I] uses only O(d) operations. Thus, the total runtime complexity of Algorithm [T] is 
0(d ■ 5 3 + 6 ■ d log d) in general. 

4. Error Analysis and Recovery Guarantees 

In this section we analyze the performance of the proposed phase retrieval method (see Algo¬ 
rithm [l|, and demonstrate measurement matrices which allow it to recover arbitrary vectors, up 
to an unknown phase factor, with high probability. Our analysis proceeds in two steps. First, in 
§4.1| and §4.2| we demonstrate the existence of a deterministic set of correlation-based measure¬ 
ments, M E (C Dxd , which allow Algorithm [I] to recover all relatively flat (i.e., non-sparse) vectors 
x E <D d . Herein, “flat” simply means that there are no long strings of consecutive entries of x all 
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of whose magnitudes are less than ^=-. The proposed measurements M are Fourier-like, roughly 
corresponding to a set of damped and windowed Fourier measurements of overlapping portions of 
x. In addition to being well conditioned, these Fourier measurements also have fast inverse matrix- 
vector multiplies via (an additional usage of) the FFT. Hence, they confer additional computational 
advantages beyond those already enjoyed by our general block circulant measurement setup. 

Next, in §4.3| we extend our deterministic recovery guarantee for flat vectors to a nonuniform 
probabilistic recovery guarantee for arbitrary vectors. This is accomplished by right-multiplying M 
with a concatenation of several Johnson-Lindenstrauss embedding matrices, each of which tends to 
“flatten out” vectors they are multiplied against. In particular, we construct a set of such matrices 
which are both (i) collectively unitary, and (ii) rapidly invertible as a group via (yet another usage 
of) the FFT. The fact that this flattening matrix is unitary preserves the well conditioned nature of 
our initial measurements, M. Furthermore, the fact that the flattening matrix enjoys a fast inverse 
matrix-vector multiply via the FFT allows us to maintain computational efficiency. Finally, the 
fact that the flattening matrix produces a flattened version of x with high probability allows us to 
apply our deterministic recovery guarantee for flat vectors to vectors which are not initially flat. 
The end result of this line of reasoning is the following recovery guarantee for noisy measurements. 


Theorem 3. Let x E with d sufficiently large have ||x|| § > C (d In d) 2 In 3 (In d) || n|| 2 Then, 
one can select a random measurement matrix M G (^ Dxd independently ofboth x and n such that 
the following holds with probability at least 1 — g , lll 2 ^^ 1 hl 3 ( lnt ^ •' Algorithms will recover an x G 
with 


(23) 


min 

6>e[0,27r] 


x — e l6l x 


< C"{d lnd ) 2 ln 3 (lnd)||n ||2 


when given arbitrarily noisy input measurements b = |Mx| 2 + nE as per 0 . Here D can be 
chosen to be 0(d • ln 2 (d) • In 3 (Incl)). Furthermore, Algorithm [i] will run in 0{d ■ ln 3 (<f) • In 3 (lnd))- 
time. 


Note that the error bound in (23) is probably sub-optimal due to, e.g., the appearance of the 
d? polylog d-term on its righthand side. It is certainly the case, at least, that stronger error bounds 
are achievable using complex normal random measurements in combination with optimization tech¬ 
niques (see, e.g., mm)- None the less, Theorem [3] provides the best existing error guarantee the 
authors are aware of for local correlation-based measurements, and computational experiments in¬ 
dicate that Algorithm [l] is highly robust to measurement noise in practice (see fj5]). Furthermore, 
it is important to point out that Theorem [3] also guarantees exact recovery, up to a global phase 
multiple, of x by Algorithm [I] in the noiseless setting (i.e., when n = 0). Finally, Algorithm [I] 
is fast, with a runtime complexity that is near-linear in d. We are now ready to begin proving 
Theorem [3l 


4.1. Well Conditioned Measurements. In this section we develop a set of deterministic mea¬ 
surements M G C Dxd that lead to well conditioned block circulant matrices M' G <C DxD in ©■ 
To begin, we choose a G [4, 00 ) and then set our local correlation masks to be 


(24) 


(m;)i 


e 

v / 2iWT 

0 


27ri-(i —1)-(Z —1) 
. e 2JV1 


3 "Herein C,C',C" € it + are all fixed and absolute constants. 
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if l < 6 
if i > 5 














for 1 < l < 25 — 1, and 1 < i < d. This leads to blocks M{ G 1 ) x ( 2<5 L from (12) with entries 
given by 

if 1 < j < S - l + 1 

0 if d - Z + 2 < j < 2 d - Z - 1 




if 2d^ l < j < 2d - 1, l < 5 

0 


if j > 1, and l = 5 

We will now begin to bound the condition number of this block circulant matrix, M', by block 
diagonalizing it via (14). 

Considering the entries of each J k G (p;(2<5-i)x(2<5-i) from (15) results in two cases. First, suppose 
that 1 < j < 6. In this case one can see that 


(25) 


(26) 


(Jk)j j 


®( 1 -j)/“ -2^-1).Q-l) 5 


1 ’ J v/25^1 


E 

z=i 


®-0'+l)/« - 27 ri-(i-l)-Q-l) 1 _ (£-2(<5-j + l)/<2 . £ 

® 2S ~1 ■ - 


—21/a ^.(fc-D-Q-i) 

(B , 

2-7ri-(fc —!)•(<$—j + 1) 


. 2-7ri-(fc —1) 

1 — ®~ 2 /“ • ® d 


V26^1 

Second, suppose that 6 + l<j< 25 — 1. In this case one can see that 


e-O'- 2 ^- 1 ))/“ -2ni-(i-l)(j-26) 

(27) (Jfcb , = - ^-- • ® 


<5-1 


(28) 


V25^1 

<E —(2(^+1 )~j)/a 


E * 

l=2S-j 


o; l n 2iri-(fc-l).(i-l) 
• (P d 


0 / . r\ / 27ri-(fc — l)*(j — <5) 

— 27rn-(a — l)-f j —1) 27rn-(fc —1)(25—j —1) 1 — £ - Z U~°)/ a . £ d 

® 25—i . m jpg • - 


, 2«ri-(fc— 1) 

1 — ® Z / a • ® d 


V25^1 

Let F a G <D QX " be the unitary a x a discrete Fourier transform matrix. Defining 

X _ e -2(S-j+l)/a . ( g27ri-(fc—1)-(<5—j+1)/ d 


® 


-(j+l)/a 


Sk,j 


<B -(2(8+l)-j)/a _ ( g27ri-(fe—1)(25—j —1)/<Z _ 

we now have that 


^ _0—2/a . 027ri-(/c—l)/d 

1 _ <e- 2(j-<5)/a . (E 27ri-(fc-l)-(j-5)/d 


(29) 


•4 = ^2 <5—1 


1 - ®“ 2 / a ■ ®2iri-(fc-l)/d 

... 0 \ 


if 1 < j < <5 

if d + 1 < j < 2 d — 1 


^ 'Sfc,! 0 

0 s fcj 2 0 

0 0 0 

\ 0 ... 0 Sfc,2<5—1 ) 

Note that the condition number of J, and therefore of M 1 , will be dictated by the singular values 

of these J*, matrices. Thus, we will continue by developing bounds for the singular values of each 
J k 6 C (2«-l)x(2tf-l)_ 

The fact that F^-i is unitary implies that 


(30) 


min \s k ,j\ < (725-1 (Jk) < <7i (Jk) < max \s k A 

ye [2<5—l] je[2tf—i] 


for all k G [d]. Thus, we will now devote ourselves to bounding the maximum and minimum values 
of \s k j\ from above and below, respectively, over all k G [d] and j G [2d — 1]. These bounds will 
then collectively yield an upper bound on the condition number of our block circulant measurement 
matrix M'. The following simple technical lemmas will be useful. 


Lemma 1. Let x G [2, oo). Then, 1 —« 1 ' x > 


2 —(E 1 / 1 


> 2-v^ v. 7 
— x ^ 20x " 
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> 


1 

X 


> 


. Furthermore, 

□ 


Proof: Note that 1 - ® l ' x = £~ =1 j ■ (2 - E,T=o a "(n+i)! 

the numerator is a monotonically increasing function of x. 


2-<b 1 / x 


Lemma 2. Let a,b,c E R + , and / : R —>• 1R below. Then, 

(1) f(x) = b ■ « _X//a (l + c • ® 2a; / a ) das a unique global minimum at x = —| ln(c), and 
(£) /(x) = b ■ <s~ x / a (l — c • ® 2l / a ) is monotonically decreasing. 

Proof: In either case we have that f'{x ) = — | • ® -3; / a ± ^ • ®‘ T / a , and f"{x) = ^ • ® -3; / a ± • ® 3; / a . 

For (1) we have a single critical point at x = — ^ ln(c), which is a global minimum since f"{x) 

0 Vx E R. For (2) we have f'{x) < 0 for all x E R. 


Note that 


(31) |sfcj| = < 


-0'+i)/“ Z 1 + <s~MS-j+L/ a - 2e“ 2 ( <5 W+i)/o cos (2% ■ [5 - j + 1] • (k - l)/d) ^ < < ^ 

1 + e -4 / a — 2©“ 2 / a cos (2n(k — l)/d) 


-(2(5+l)-j)/a _ 


I + 0 4 0 s )/ a — 2 e 2 0 5 )/ a cos{2-k ■ [j — 8\ ■ (k — 1)/d) 


1+0 4 / a — 2 e 2 / a cos (2n(k — l)/d) 


if 5 + 1 < j < 25 - 1 


Fix k E [d]. When 1 < j < <5 we have 


(32) 


max I Si. j I < max ® 
ie[<5] J jm 1 


-(j+D/a . l + ®- 2(m ~ j)A ^ < ®- 2/a (l + ®- 2 ^) 


1 — ® 2 / a 


1 — ® 2 / a 


where the second inequality follows from part one of Lemma [2j When S + 1 < j < 26 — 1 we have 


(33) max |sr,-| < max 

je[2S-i]\[S] je[26-l]\[6] 


® 


— (2(5+1)—j)/ 


_ j)/a 1 + <B-20-5)/a^ ^ (E-S/o^ + e -2(5-l)/a) 


1 — ®~ 2 / a 


1 - ® — 2/a 


where the second inequality again follows from part one of Lemma [2j Finally, combining (32) and 

(33) one can see that 

/T . ®- 2 /“(l + ®- 25 /“) ®- 2 / a (l + ®- 25 /“) 20®~ 2 /“ _ 2/a 

(34) a 1 ( J k ) < - / __ 9/ „ -- < a- ^—- < a--- < 3a • « 2/a , 


1 - ® — 2/a 


2(2 - ® 2 / a ) 


where the second inequality follows from Lemma [I] with a E [4, 00 ). 

Turning our attention to the lower bound, we note that part two of Lemma [2] implies that 


(35) 


min|sfcj| > min ® ^ + l)/a ■ 


1 _ £-2(5+1 —D/a \ (g-^+L/a^ _ 0 -2/a) 


J'e[c5] 


/e[<5] 


1 + ®- 2 /a 


> 


1 + ®-2/ a 


Similarly, part two of Lemma [2] also ensures that 


(36) min 1st,-1 > min 

ie[25-i]\[5] u je[28-l]\{S\ 


y -(2(5+l)-j)/a . 


1 _ e -2 (j~5)/a' 
1 + ®~2/a 


> 


® 


— (<5+l)/a(i _ 0 -2/a^ 


1 + ® _ 2/a 


Combining (35) and (|36|) we see that 
(37) 


®-(^)/“(l-®- 2 /“) ^ 7 (S+ l)/a 

CT25 - l(Jfc) - -l^®^7o- > 20 a'® ’ 


where the second inequality follows from Lemma [T] with a E [4, 00 ). We are now equipped to prove 
the main theorem of this section. 
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Theorem 4. Define M' E <C DxD via (24) with a := max {4, Then, 


(M ; ) < max 1144® 


2 5f.(i-i) 2 


Proof: We have from 
(38) re (AT') = 


and (37) that 
<ti (M') 


cri (J) 


< 


cjd (AT) an (J) 

Minimizing the rightmost upper bound as a function of a yields the stated result. 


maxfcg^] <Ji (Jfc) 2 . (5—i)/a 

min fce[d] <725-1 (4) 


□ 


Theorem [4] guarantees the existence of measurements which allow for the approximation of the 
phase difference vector y E C 13 defined in ([9]). In the next three subsections we analyze the 
approximation of x E <D d from y ~ y via the techniques discussed in ^3| 

4.2. A Recovery Guarantee for Flat Vectors. As mentioned above, Algorithm [2] implicitly 
assumes that y E <C D does not contain any strings of 5 — 1 consecutive entires (mod d ) all with very 
small magnitudes. In this section we will demonstrate that a general class of non-sparse vectors 
x E <C d lead to such y whenever noise levels are low enough. As a result, we will prove deterministic 
approximation guarantees for all sufficiently non-sparse vectors x E in the high signal-to-noise 
setting. More specifically, we will utilize the following concrete characterization of flatness (i.e., 
non-sparsity) for x hereafter. 

Definition 3. Let m E [d]. A vector x E <D d will be called m-flat if can be partitioned into at least 
l_mJ blocks of consecutive entries such that: 

(1) Every block contains either m or m + 1 neighboring entries of x, and 

(2) Every such block of entries contains at least one entry whose magnitude is > 

The following simple lemma proves that Algorithm [I] accurately estimates the magnitude of every 
entry of x. Both it and the next lemma use the notation from © and ( |22[ ), and implicitly assume 
the invertibility of M'. 

Lemma 3. Let |xj| = \J\Vk{j,j)\ = \J\ x j x j + ™k(j,j )I be the estimate of \xj\ utilized in line 3 of 
Algorithm ^ J where n := (LL')~ 1 Pn . Then \\xj\ — \xj\\ 2 < SHnH^. 

Proof: Let a := \xj\ E R + and e := 'hfc(jj) £ <D. We upper bound the righthand side of 


= a 2 + |a 2 + e| — 2ay / |a 2 + e| < 2a 2 + |e| — 2 a 2 


i + 4 


by considering two cases: If o 2 < |e| we may bound the rightmost negative term by zero in order 
to obtain the desired result. If a 2 > lei then 


- 2 a 2 


i+4 

a z 


so that 


\Xp — li 


< -2a z \H- ^ < -2a 2 fl- 


< 2a 2 + |e| - 2a 2 1 - = 3|e 


again holds as desired. 


□ 


The next lemma proves that Algorithm [2] will accurately estimate the phases of all entires Xj 
whose magnitudes are sufficiently large relative to the noise level. This lemma requires that the 
vector x be m-flat for a sufficiently small m. 
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Lemma 4. Suppose that x e <C d is [^pj -flat with d > 2 and 11x|11 > 26 d 2 ||n||oo; where n = 
(M') _1 Pn as in (21). Let k 6 [d] 6 e such that \xk\ 2 > | d ||n||oo; and (f a € [0, 2tt] he the true phase 
of the entry x a of x chosen in line 2 of Algorithm [2| Then, Algorithm will produce an estimate 
(fk of the true phase of Xk, (fk ■= arg (xk), satisfying 




< 


10 d 2 


n 


Proof: Let a := 


and /3 := Note that a > 1, |x fc | > f, ||n||oo < 3 ^, and \x a \ > ^f3 


all hold. Furthermore, the fact that x is [^ppj-flat guarantees the existence of a sequence of entries 
Xb t ,..., Xb q such that: 

( 1 ) 0 < | — bi ) mod d| < 5 — 1 for all l G [q — 1 ], 

( 2 ) 0 < max{|( 6 i — a) mod d|, |(k — b q ) mod d|} <5 — 1 , and 

(3) | Xb t \ > /3 for all l €[q\. 

Thus, line 18 of Algorithm [2] is guaranteed to identify a sequence of entries x n ,... such that: 

(1) 0 < \(ji +1 - ji ) mod d| <5-1 for all l G [q - 1], 

( 2 ) 0 < max{|(ji — a) mod d|, \ (k — j p ) mod d|} < 5 — 1 , and 

(3) lajjJ > for all l G [q\. 

These entries contribute to the estimate 4>k of (fk — (fa given by Algorithm [ 2 J 


( v -1 


(39) (f k - 4>k,jp+ T, +« 


' 31 , 0 . 


, 1=1 





T {^(fj 1 (fa ) — (fk ( 


where (f ]l+un := 5 (arg (yfc ( J;+1 J; )) - arg J(+1 ))) is used as an estimate in line 13 of (f 3l+u3l := 
aig {Uk(ji +1 ,ji)) — fiji+i ~~ (fjr 

To bound the approximation error in (39) we note that the law of sines implies that 

'|n||oo 


sin 


Jl+i ,31 


*3l+l,3l\ I - 


sm 


Jl+i ,31 


Cjl+Ul 


< 


I m> 


where yy = Xj l+1 Xj t + hy is, without loss of generality, the smaller of the two measurements (in 
magnitude) that contribute to the estimate (fj l+1 ,j r Using the facts from the previous paragraph 
concerning |x a |, \xk\, and x 3l | for all l E \p], one can show that every measurement yy that 
contributes to an estimate f>j l+l , 3l used in (39) will have \yy\ > Hence, we have that 

2 a:_ 1 

I 2 


sm 


’3l+i,Jl V3l+l,3l\) — R2 H 11 !! 00 < 4 


holds for all l 6 \jp\. As a consequence, we can infer that 

bji+iji ~ (< Pji+i ~ $ 3 . 






. 5 . 

< - sm 
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3l+i,Jl 


l '3l+l,3l\ i — 


5 a 

< 2 n 


also holds for all all I 6 [p]. Combining this with (39) we learn that 
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(fk ~ 4>k + (fa 


^ ,5 a 

- d 2?l |n|1 
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This, in turn, implies that 

0^0fc _ 0^(0fc 0a) _ 0^(01: 0fe _ t _ 0a) _ 3 


= 2 sin ( - 
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~ 4>k + 1 
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<fk (fk T (fa 
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proving the lemma. 


□ 


We are now prepared to prove the main result of this section. 

Theorem 5. There exist fi xed u niversal constants C,C' E R + such that following holds: Let 

> 


M E <C Dxd be defined as in § | j.l\ and suppose that x E <D d is J -flat with d > 2 and 
C (5 — l)d 2 ||n|| 2 - Then, Algorithm [l] is guaranteed to recover an x E <D rf with 


x 


2 — 


(40) 


mm 

6»S[0,27r] 


x - ® i0 x 


< C’d\8- l)||n|| 2 


when given arbitrarily noisy input measurements b = |Mx| 2 + n E R c as per ([Tj) . Furthermore, 
Algorithm^ requires just 0(6 ■ dlogd) operations for this choice of M E <C Dxd . 

Proof: First, note that M’ will be invertible by Theorem |4j Thus, we may set n = (M / ) _1 Pn. 
Furthermore, the proof of Theorem [4] tells us that there exists an explicit universal constant C" E 
R + such that 


(41) 


C"(8 — 1) ||n|| 2 > 


n 2 


> n 2 > n 


&D (M>) 

Setting C = 26 C" now allows us to verify that 

11 x 11 2 > 26 C" (6 — 1 )d 2 11 n || 2 > 26 d 2 || fiHoo. 

Hence, we will be able to apply both Lemmas [3] and [4] as desired. 

Let i f a E [0, 27t] be as in Lemma [4j and let p,p E G d be the vectors of phases of the entries of 
(g—i<p 0 . X and x, respectively, so that p 3 = and pj = hold for all j E [d] . Similarly, recall 

that |x|, |x| E R d denote the vectors of magnitudes of the entries of x and x, respectively, so that 
\x\j = \xj\ and \x\j = \J\Vk(j,j)\ = \J\ x j x j + I hold for all j E [d] (here we are again using the 
notation from (21) and (22)). Thus, ®~ 1< ^“x = |x| o p and x = |x| o p both hold, where o denotes 


the entrywise (Hadamard) product. 

To obtain ( f40| ) we bound 

||®-^x — x|| 2 < || |x| op — |x| o p|L + || |x| o p — |x| o p| 


(42) 



@*0j — <g*(0i 0a) 




| X o’ | \X n 
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\ E 

\ jm\s 


Xj 


gi0j — 0^(0j 0a) 


\/ 3cZ|| nHoo, 


where S := {k E [d] | |.Tfc| 2 > | d (Inline}, and the last inequality results from an application of 
Lemma [3j Bounding the first two terms of ( |42[ ) using Lemma [4] and the properties of S 
that 


we can see 


® ^“X — x|| 2 < 


\ j 


Ei 


jes 


10 d 2 


n 


x 


T y/ 10d 2 11n11cx3 + yj 3d|| 


n 


< 2d^/lOUnlloo + < 3d-\/l0||n| 


Using (41) now allows us to establish that 


|® 1<?ia x — x||2 < 3d\/l0jjfi||^ < 2>d\JlQC"(5 — 1)||n ||2 
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which implies (40). 

We finish by noting that the runtime complexity of Algorithm [I] simplifies to 0(5 ■ dlogd) op¬ 
erations when using the measurements defined in §4.1| because the matrix J also has a simple 
block-diagonal factorization in this case (recall (14) and (|29[) in light of 


□ 


The following corollary easily follows from Theorem [5} 


Corollary 1. Let M G (D' Dxd be defined as in {f.l, and suppose that x G <C d is m-flat for some 
m < |_^J . Then, Algorithm [f] will recover an x G <D rf with x = c l6l x for some 6 G [0, 27r] when 
given noiseless input measurements b = |Mx| 2 G Furthermore, Algorithm [7] requires just 

0(5 ■ dlogd ) operations in this case. 

Of course, not all vectors are m-flat for a suitably small value of m. We will generalize our results 
to arbitrary vectors x in the next section. This will be accomplished by showing that a well chosen 
random unitary matrix W will have the property that Wx is m-flat with high probability. 

4.3. Flattening Arbitrary Vectors with High Probability. Let W G C dxd be the random 
unitary matrix 

(43) W := PFB , 

where P G {0, l} rfxd is a permutation matrix selected uniformly at random from the set of all 

d x d permutation matrices, F is the unitary d X d discrete Fourier transform matrix, and B G 

{—1,0, l| rfxci i s a random diagonal matrix with i.i.d. symmetric Bernoulli entries on its diagonal. 

For any given m G [d], one can naturally partition W into blocks of contiguous rows, each 

of cardinality either m or m + 1. This defines the 1^1 sub-matrices of W, W i...., W. i _d i G 

1 L m J 

<D( m+ i) xd and W d _ m y A j + i,..., G <C mxd , by 


(44) 


W = 


( w 1 
w 2 


V w [£i ) 

Note that each renormalized sub-matrix of W, ■ Wj for j G [|_^J], is almost a random 
sampling matrix (17) times a random diagonal Bernoulli matrix. As a result, Theorems [I] and [2] 
suggest that each \J £ ■ Wj should behave like a JL(?n,d,e)-embedding of our signal x into C m (or 

C m+1 ). If true, it would then be reasonable to expect that each block of m consecutive entries of 
Wx should have roughly the same t^-norm as one another. This, in turn, suggests that the random 
unitary matrix W should effectively flatten x with high probability, especially when m is small. 

Of course, there are several small difficulties that must be addressed before the argument above 

can be made rigorous. First, the rows of F contributing to \J~^ ■ Wj are effectively independently 

sampled uniformly without replacement from the set of all rows of F by our choice of p0 This 
means that Theorem [2] does not strictly apply in our situation since we can not select any row 
of F more than once. Secondly, some care must be taken in order to select the smallest value of 
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Note that W being unitary helps us to be able to guarantee both exact recovery of x in the noiseless setting, 
and well behaved approximation of x in the noisy setting. If we had chosen rows of F with replacement instead 


of without replacement in (431 we would not have a unitary (or even invertible) square matrix W with probability 
—> 1/® as d —> oo. If one decides to make W rectangular instead of square simply in order to allow sampling from 
the rows of F with replacement, then many other small difficulties and inefficiencies result. Thus, we let P be a 
randomly selected permutation matrix instead of creating it by putting a one in each row independently at random. 
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m possible in (44), since Wx will become flatter, or less sparse, as m decreases. As a result, nn 
will effectively provide a theoretical lower bound on the size of 5 that one can utilize and still be 
guaranteed to accurately recover Wx via our ^techniques (recall also ( |l. 2 | above). We are now 
ready to begin proving our main result concerning W. 

The following simple lemma will be used in order to help adapt Theorem [2] to the situation where 
the rows of F are sampled uniformly without replacement. 


Lemma 5. Let m G N with m < \fd. Independently draw x\,..., x m from [d] uniformly at random 
with replacement. Then, P [|{.ti, ... ,x m }\ = m] > 1/2. 


Proof: A short induction argument establishes that 

771—1 / . \ 771—1 . O 

(45) P[|{x 1 ,...,, m }|=m]=n( 1 -^)^-E^ = 1 -^ 

j= i ^ ' j = 1 

The result now follows easily via algebraic manipulation. 


□ 


The following corollary of Theorem [ 2 ] now demonstrates that a random sampling matrix R' 
formed by sampling a subset of rows of size m uniformly at random from F will still be RIP(2s,e/C'i) 
with high probability. 


Corollary 2. Let p 6 (0,1). Form a random sampling matrix R' G C mxd by independently 
sampling m rows from F uniformly without replacement. If the number of rows, m, satisfies both 


(46) 


Vd > m > C, ■ 

e 1 


and 


(47) 


Vd > m > C 4 ■ 


s l°g( 2 /p) 


then R' will be RIP(2s,e/C\) with probability at least 1 — p. 


Proof: Let S := {x\,... ,x m }, where each Xj G [d] is selected independently and uniformly at 
random from [d] (with replacement). Similarly, let S' C [d] be a subset of [d] chosen uniformly 
at random from all subsets of [d] with cardinality m (i.e., let S' contain m elements sampled 
independently and uniformly from [d] without replacement). Furthermore, let E denote the event 
that the random sampling matrix whose rows from F are x\,... ,x m is not RIP(2s,e/C'i). Finally, 
let E' denote the event that the random sampling matrix whose rows from F are the elements of 
S' is not RIP(2s,e/C'i). Applying Lemma [ 5 ] we can now see that 

(48) P [E] > P [E | |S| = m] ■ P [|«S| = m\ = P [E'] ■ P [|«S| = m\ > ^ • P [E’] . 

The stated result now follows from Theorem 1 □ 


We are now ready to prove that W will flatten the signal x G with high probability provided 
that m can be chosen appropriately. We have the following theorem: 

Theorem 6. Let W G <C dxd be formed as per (43) for d > 8 . Then, Wx G <D d will be m-flat with 
probability at least 1 — ^ provided that yfd > m + 1 > C 5 • ln 2 (d) • In 3 (Ind) J^| 

Proof: Our first goal will be to show that each W \,..., W\ &_ 1 from (44) is a is a rescaled JL(m,d,l/2)- 

L m J 

embedding of {x} into C m (or <D m+1 ). This will guarantee that each consecutive block of m (or 
m + 1 ) entries of Wx has roughly the same ^ 2 -norm. 


3 ®Here C 5 G 1R + is a fixed absolute constant. 
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To achieve this goal we will apply Theorem |lj to each yj ^ ■ W\ , • • •, y] ^ • W|_d j in order to show 
that each one embeds {x} into <D m (or <D m+1 ) with probability at least 1 — The union bound 
will then imply that {x} is embedded by all the yj~^ • Wj with probability at least 1 — 2 ^. This 

argument will go through as long as each J■ W\B~ l , • ■ ■ j y ~ ■ W i d_ i B~ l is RIP(2s,l/2Ci) for 

V V L m J 

some s>C 2 - ln(8d). Hence, we will now focus on determining the range of m which guarantees 
that all J of these matrices are RIP(|"2C2 • ln(8d)] ,l/2Ci). 

To demonstrate that each yj~^ ■ \VjB~ 1 is RIP(|"2C2 • ln(8d)],l/2C l i) with probability at least 
1 — ^ one may apply Corollary [ 2 ] with m (or m + 1) chosen as above (assuming d > 8). Another 
application of the union bound then establishes that all of J — ■ W\B ~ l , • ■ ■ , \ • Wi _d 1 B _1 will be 

RIP( [ 2 C 2 dn(8d)] ,l/2Ci) with probability at least 1 — h—. One final application of the union bound 
then establishes our first goal: All of yj~^- Wi, . .., yj~^ ■ Wyd_ j will be JL (m.ci, 1 /2)-embeddings of 
{x} with probability at least 1 — 

To finish the proof, we now note that Wx will be m-flat whenever all \_^\ of the \J^ • Wj 
matrices are JL(m,d,l/2)-embeddings of {x}. To see why, suppose that 

-|| x llo < —||Wjx||| < -|| x |||. 

2" 112 “ m" J 112 “ 2" 112 

This implies that ^j\\ x || 2 > ll^-xlll — ^Il x ll 2 ! which can only happen if both of the following 
hold: (i) at least one entry of WjX has magnitude at least , and (ii) all entires of WjX 

have magnitude less than yj 


' 3m+3 I 


2d 


x 112 = 


— yl^^\\Wx\\ 2 . This proves the theorem. 


□ 


Theorem [6] now allows us to alter our measurements so that we can recover arbitrary vectors. 
We are now ready to prove Theorem [3j 


4.4. Proof of Theorem [3 W e set our measurement matrix M £ <C Dxd to be M := MW where 
M £ C' Dxd is defined as in i 4.l[ and W £ (C dxd is as defined as in (43). Theorem [b] guarantees that 


Wx will be m = 0(ln z (d) ■ hr (Inci))-fiat with probability at least 1 — 


provided that 


C5-ln 2 (cZ)-ln 3 (ln d) 

d is sufficiently large. Furthermore, if VFx is m-flat and 5 > 2m + 3, then Theorem [5] guarantees 
that Algorithm [I] will recover an x 7 £ <D d satisfying 


(49) 


mm 

ee[o,27r] 


Ifx — ® 10 x 7 


< C d z (5 - l)||n|| 2 


when given the noisy input measurements b = \MWx\ 2 + n £ R^ 1 . Hence, choosing <5 = 0(ln 2 (d) ■ 
In 3 (lnd)) allows us to recover x 7 ~ W fe^x), for some unknown phase < f £ [0, 2ir], with probability 


at least 1 — 


C 5 -ln 2 (d)-ln 3 (ln d) 
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Setting x = TT*x 7 we can see that 


mm 

6»e[0,27r] 


x — ® 10 x 


= mm 
2 6»e[0,27r] 

= min 

0G[O,2tt] 


w* 


Wx — e l9 x 


Wx - <s ie x' 


< C'd 2 (5 — 1)||n|| 2 


by (49) since W is always unitary. 


2 ^The probability estimate in Theorem [ 3 ] follows immediately with C = C 5 . 
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Considering the runtime complexity, we note that x 7 can be obtained in 0(6 ■ dlogd) = 0(d ■ 
ln 3 (d) • In 3 (lnd)) operations by Theorem^ Computing W*^ can then be done in 0(d\ogd) oper¬ 
ations via an inverse fast Fourier transform. The stated runtime complexity follows. 

It is interesting to note that alternate constructions of flattening matrices, W, with fast inverse- 
matrix vector multiplies can also be created by using sparse Johnson-Lindenstrauss embedding 
matrices in the place of our Fourier-based matrices (see, e.g., [7]). Thus, one has several choices of 
matrices W to use in concert with a given block-circulant measurement matrix M in principle. 

5. Empirical Evaluation 

We now present numerical results demonstrating the efficiency and robustness of the BlockPR 
algorithm. We test our algorithm on i.i.d. zero-mean complex random Gaussian test signals. To 
test noise robustness, we add i.i.d random Gaussian noise to the squared magnitude measurements 
at desired signal to noise ratios (SNRs). In particular, the noise vector n E in |l]) is chosen to 
be i.i.d. J\f(0, a 2 lf)). The variance a 2 is chosen such that 

SNR (dB) = 10 log 10 ■ 

Errors in the recovered signal are also reported in dB with 

Error (dB) = 10 log 10 f —r^- 

V IMI2 

where x denotes the recovered signal. Matlab code used to generate the numerical results is freely 
available at M- 




(a) Execution time - Phase Retrieval from D = 7d (b) Execution time Phase Retrieval from D = 
measurements. \2d\og 2 d~\ measurements. 

Figure 2. Computational Efficiency of the BlockPR Phase Retrieval Algorithm 
(WF denotes the Wirtinger Flow algorithm; parentheses are used to denote the 
type of measurements used) 
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We start by presenting numerical simulations demonstrating the efficiency of the BlockPR algo¬ 
rithm. In particular, we plot the execution time for solving the phase retrieval problem (averaged 
over 100 trials) from perfect (noiseless) measurements in Figure [ 2 ] Simulations were performed on 
a laptop computer with an Intel® Core™i3-2350M processor, 6GB RAM and Matlab R2015a. For 
comparison, we also plot execution times for the Gerchberg-Saxton alternating projection algorithm, 
semidefinite programming-based PhaseLift algorithm]^] and the recently introduced Wirtinger Flow 
algorithm. Simulation results with PhaseLift and the Gerchberg-Saxton alternating projection algo¬ 
rithm use random complex Gaussian measurements. We note that even though alternating projec¬ 
tion algorithms are known to empirically work with certain FFT-time measurement constructions, 
recovery guarantees are only available for random Gaussian measurements (see for example [42]). 
Simulation results for the Wirtinger Flow algorithm are generated using either random complex 
Gaussian measurements or coded diffraction patterns(CDPs). We remark here that the motivating 
application typically determines the type of measurements used. For example, random Gaussian 
measurements or coded diffraction patterns - such as those employed by PhaseLift and Wirtinger 
Flow - are not directly applicable in local correlation-based applications, which are of primary 
interest in this discussion. Nevertheless, Figure [2] provides a useful comparison for evaluating the 
efficiency of different phase retrieval algorithms. Figure [2a] plots the execution time for solving 
the phase retrieval problem using 7 d measurements. We observe that PhaseLift and alternating 
projections have computational complexities which scale cubically with the problem dimension, 
thereby limiting their applicability to small-scale problems. On the other hand, both BlockPR and 
Wirtinger Flow (with coded diffraction pattern measurements) have essentially FFT-time com¬ 
putational complexities, with BlockPR observed to have a smaller constant than Wirtinger Flow. 


Similar speedup is observed in Figure 2b where the number of measurements used by BlockPR 
and Wirtinger Flow grows log-linearly in the problem size d. For reference, execution times for 
phase retrieval from 0(d ) random complex Gaussian magnitude measurements using PhaseLift 
and Wirtinger Flow are also provided. These plots confirm the significant speedup offered by the 
proposed BlockPR algorithm over other comparable methods and corroborate the efficiency of the 
proposed computational framework for solving large-scale phase retrieval problems. 


We next demonstrate robustness of the proposed framework to additive noise. Figure 3a plots 
the reconstruction error in recovering a d = 64 complex random Gaussian signal at different SNRs, 
with each data point computed as the average of 100 trials j^] We include reconstruction results 
using Gerchberg-Saxton, PhaseLift and Wirtinger Flow algorithms for comparison. In all cases, 
the algorithms recover the unknown signal x upto (or slightly better than) the added noise level. 
As opposed to the other algorithms in Figure [3aJ BlockPR uses local measurements as prescribed 


in (24), with each measurement only providing information about a corresponding local region of 
the underlying signal. Hence, we expect BlockPR to require more measurements than the other 
algorithms in Figure [3a] to achieve the same noise robustness. Indeed, this is confirmed in the 
figure with BlockPR requiring roughly twice the number of measurements to achieve the same 
noise robustness. We note, however, that the number of additional measurements is typically a 
small constant irrespective of the problem size. Similar results are provided in Figure [3b| for a larger 
problem size ( d = 1024). Until now, the deterministic Fourier-based measurement masks of (24) 


have been utilized to generate simulation results. In Figure [3b[ we additionally provide robustness 
results using oversampled random local masks. Specifically, we choose the non-zero entries of each 
of the d measurement masks E in ([t]) to be i.i.d. complex random Gaussian entries for 


i = 1,2, ■ ■ • ,7 • (2 5 — 1), where 7 > 1 is an oversampling factor. The dashed fine in Figure 3b 


was 
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The PhaseLift algorithm was implemented as a trace-regularized least-squares problem using CVX |25i 1241 . a 
package for specifying and solving convex programs in Matlab. 
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A few iterations of the Gerchberg-Saxton alternating projection algorithm were used to post-process the 
reconstructions. 
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Robustness to Additive Noise, d = 64, D = 7d 


Robustness to Additive Noise, d = 1024 




20 25 30 35 40 45 50 55 60 10 20 30 40 50 60 

Noise Level in SNR (dB) Noise Level in SNR (dB) 


(a) Robustness to Additive Noise (d = 64). (b) Robustness to Additive Noise (d = 1024). 

Figure 3. Robustness of the Proposed BlockPR Phase Retrieval Algorithm to Ad¬ 
ditive Noise 

generated using 7 = 1.5, while the solid line was generated using 7 = 2 . In either case, the block 
length 6 was chosen so as to achieve a total of D = [ 6 dlog 2 d | and D = [T2dlog 2 d] measurements 
respectively. This plot confirms the graceful degradation of reconstruction error with added noise. 
Moreover, the block length <5 and oversampling factor 7 may be suitably chosen to achieve desired 
noise robustness. 



Figure 4. Well Conditioned Measurements - Condition Number as a Function of 
the Block Length 5 

Finally, Figure [4] plots the condition number of the system matrix used to solve for the phase 
differences (matrix M' in Q. The figure plots the condition number as a function of the block 
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length 6 for d = 64^] It confirms that the condition number scales as a small multiple of S 2 . The 
figure also includes a plot of the condition number when using random masks at an oversampling 
factor of 1.5. Empirical simulations suggest that the use of oversampling can lead to essentially- 
linear growth in the condition number k(M') with block length 5. Analyzing the performance of 
such an oversampled measurement construction and explicitly writing down associated condition 
number bounds would be an interesting avenue for future research. 


Execution Time, D = [4rflog 2 d\ 




No. of Measurements (in multiples of d) 


(a) Execution Time vs. Problem Size (Local Cor¬ 
relation Measurements). 


(b) Reconstruction Error vs. Number of Measure¬ 
ments (d = 64, i.i.d. Gaussian noise). 



10 20 30 40 50 60 

Noise Level in SNR (dB) 


Robustness to Measurement Noise, d = 64, D = 15 d 



(c) Robustness to Additive Noise (d = 64, D = 7d, (d) Robustness to Additive Noise (d = 64, D = 15d, 

(Fourier-like) Local Correlation Measurements) (Fourier-like) Local Correlation Measurements) 

Figure 5. Evaluation of Phase Retrieval Algorithms for Local Correlation Measurements 


23 


The condition number is independent of the problem dimension d and depends only on the block length 5. 
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We conclude this section by presenting representative numerical results highlighting the superior 
performance of the proposed BlockPR algorithm for local measurements. Specifically, we use the 
windowed Fourier-like local correlation measurements described in Section 4.1 with the BlockPR , 
PhaseLift and Gerchberg-Saxton alternating projection algorithms. We note that there are no the¬ 
oretical recovery guarantees with PhaseLift and the Gerchberg-Saxton algorithm for such measure¬ 
ments; however, empirical simulations suggest that for sufficiently large numbers of measurements, 
both algorithms recover the unknown signal. Indeed, this is illustrated in Figure 5b, where the 


reconstruction error in recovering a d = 64 length complex vector is plotted as a function of the 
number of measurements required. For global (random Gaussian) measurements, it is known that 
PhaseLift recovers signals upto the noise level with about 6 d measurements (see, for example |12|). 
Yet, for the local measurements studied here, significantly larger numbers of measurements are 
necessary for successful phase retrieval by PhaseLift (and indeed the Gerchberg-Saxton alternating 
projection algorithm). Figure 5b shows that the proposed BlockPR algorithm compares well with 
PhaseLift and almost always outperforms alternating projections. We note that the PhaseLift error 
plots are only shown for upto D = 15 d measurements due to memory issues in implementing the 
algorithm for larger D with CVX on a laptop computer. 

The performance of the proposed BlockPR algorithm is particularly noteworthy since it has an 
essentially FFT-time computational cost as illustrated in Figure 5a Unsurprisingly, the compu¬ 


tational cost of PhaseLift scales as 0(d 3 ). Note that we had to use the Matlab software package 
TFOCS [U [5] to implement PhaseLift for large problem sizes. TFOCS implements fast first-order 
conic solvers which trade off some accuracy for efficiency gains, thereby enabling the solution of 
large problems. We also note that the computational cost of the Gerchberg-Saxton alternating 
projections algorithm scales quadratically despite the use of FFT-based linear system solvers in 
each iteration. Numerical experiments suggest that the number of alternating projection iterations 
grows with the problem size, thereby resulting in the quadratic scaling. 

For completeness, Figures [5c| and [5d| plot the reconstruction error versus added noise level using 
D = 7d and D = 15 d measurements respectively for all three algorithms using local correlation 
measurements. In each plot, we recover a d = 64 length complex vector from i.i.d Gaussian noise 
corrupted phaseless measurements. With D = 15 d measurements, both BlockPR and PhaseLift 
recover the unknown signal to the level of noise, while for D = 7d measurements, there is a roughly 
KMB error in the reconstructed signal. Note the increased number of measurements necessary for 
PhaseLift to achieve the same error performance as in Figure 3a - this highlights the challenging 
nature of reconstructing signals from local phaseless measurements. 


6. Extension: Sublinear-Time Phase Retrieval for Compressible Signals 

In this section we briefly focus on the compressive phase retrieval setting, (see, e.g., mmm 
EH DU 08]), where one aims to approximate a sparse or compressible x £ <D d using fewer magnitude 
measurements than required for the recovery of general x. It is known that robust compressive 
phase retrieval for s-sparse vectors is possible using only 0(slog(cZ/s)) magnitude measurements 
mum. In this section we prove that it is also possible to recover s-sparse vectors x £ <C d up to an 
unknown phase factor in only 0(s log 6 d)-time using 0(slog°d) magnitude measurements. Thus, 
we establish the first known nearly runtime-optimal (i.e., essentially linear-time in s) compressive 
phase retrieval recovery result. In particular, we prove the following theorem. 

Theorem 7. There exists a deterministic algorithm A : R^ —> for which the following holds: 
Let e £ (0,1], x £ with d sufficiently large, and s £ [d]. Then, one can select a random 
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measurement matrix M £ <C Dxd such that 


(50) 


mm 

ee[o,2jr] 


did 


x — A 


Tie 


< 


X — X 


opt I 


+ 


x — X 


opt 

(s/ £ ) 


is true with probability at least 1 — 
• In d) 


C-ln 2 (d) -In 3 (In d) 

Furthermore, the algorithm will run in O 


^ Here D can be chosen to be C>(|-ln 3 (^)-ln 3 (in 
ln 4 (|) • In 3 (in • In d)-time in that casep’] 


S 

e 


We prove Theorem [7] by following the generic compressive phase retrieval recipe presented in 
Let C £ C mxrf he any compressive sensing matrix with an associated sparse approximation algo¬ 
rithm A : C m -»• <C d (see, e.g., PH H3 ED ESI E1IH1 El]), and let P £ C Dxm he any phase retrieval 
matrix with an associated recovery algorithm <F : 1R D —> C m . Then, Ao$ : R-° —>• <C d will approx¬ 
imately recover compressible vectors x £ <D rf up to an unknown phase factor when provided with 
the magnitude measurements |PCx|. That is, one may first use d* to recover e^(Cx) = C'(® 1<?i x) 
for some unknown £ [0, 27r] from |PCx|, and then use A to recover ® 1 '^ > x from C'(® 1<?i x). If both 
d> and A are efficient, the result will be an efficient sparse phase retrieval method. 

Herein we will utilize Algorithm [T] as our phase retrieval method. Note that it’s runtime is only 
O (m log 4 m) , making it optimal up to log factors (recall Theorem [ 3 ]) . For the compressive sensing 
method we will utilize the following algorithmic result from 


Theorem 8. Let e £ (0,1], a £ [2/3,1), x £ C d , and s £ [d\. With probability at least a the 
deterministic compressive sensing algorithm from [28] will output a vector z £ satisfying 

opt 


22e 


(51) 


x — z 


I2 — 


x — X 


opt I 


+ 


X — X 


(s/e) 


V~s 


when executed with random linear input measurements A4x £ <D m . Here m = O ■ In 
suffices. The required runtime of the algorithm is O • In In ) * n this case 

Theorem [7] now follows easily from Theorem [3] with n = 0, and Theorem [8] 


s/e 

1 — <J 
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In d 


7. Future Work 


This paper provides the first known global robust recovery guarantees for ptychographic phase 
retrieval problems, as well a new numerical lifting-1-angular synchronization solution approach which 
is applicable with correlation-type measurements resulting from a wide class of locally supported 
masks m^,. In doing so, it opens up many additional avenues of research. Examples include, e.g., 
questions associated with the use of different boundary conditions for x. Herein, x is assumed to be 
periodic, which leads to the lifted linear system matrix M' in © being block circulant. It would be 
interesting to derive condition number bounds, as well as specialized numerical solution techniques, 
for the lifted linear system matrices that result from alternate boundary condition assumptions. 
More generally, it would be interesting to derive theoretical condition number bounds for the types 
of lifted linear systems that arise from more general classes of masks m w under various sets of 
boundary condition assumptions for x. 

It would also be interesting to explore alternate approaches, besides Algorithm [2j for solving 
the angular synchronization problem via local phase differences that appears in the second stage of 
our proposed phase retrieval approach. In particular, it would be interesting to develop theoretical 
error bounds for other solution methods (e.g., [HJ|) in the presence of noise for these types of local 

94 -i- 

Here C £ 1R + is a fixed absolute constant. 

2 °For the sake of simplicity, we assume s = fl(logd) when stating the measurement and runtime bounds above. 

2 ®For the sake of simplicity, we assume s = fl(logd) when stating the measurement and runtime bounds above. 

27 























angular synchronization problems. Finally, it would also be instructive to carefully consider the 

optimality (or lack thereof) of the robustness guarantees derived herein for ?n-flat signals (i.e., 

Theorem [ 5 ]) under various noise models. 
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